AITopics | louis-philippe morency

Collaborating Authors

louis-philippe morency

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deep Multimodal Multilinear Fusion with High-order Polynomial Pooling

Ming Hou, Jiajia Tang, Jianhai Zhang, Wanzeng Kong, Qibin Zhao

Neural Information Processing SystemsFeb-15-2026, 03:24:39 GMT

More importantly, simply fusing features all at once ignores the complex local intercorrelations, leading to the deterioration of prediction. In this work, we first propose a polynomial tensor pooling (PTP) block for integrating multimodal features by considering high-order moments, followed by a tensorized fully connected layer. Treating PTP as a building block, we further establish a hierarchical polynomial fusion network (HPFN) to recursively transmit local correlations into global ones.

artificial intelligence, machine learning, window, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Japan (0.04)
Asia > China (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

Add feedback

TrustworthyMultimodalRegression withMixtureofNormal-inverseGammaDistributions

Neural Information Processing SystemsFeb-8-2026, 05:45:31 GMT

For example, the autonomous driving systems are usually equipped with multiple sensors to collect information from different perspectives [2]. In medical diagnosis [3], multimodal data usually come from different types of examinations, typically including a variety of clinical data.

artificial intelligence, machine learning, modality, (20 more...)

Neural Information Processing Systems

Country:

Asia > China > Tianjin Province > Tianjin (0.05)
Asia > Singapore (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)

Genre: Research Report (0.34)

Industry: Health & Medicine > Diagnostic Medicine (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model Wenbing Li Hang Zhou Junqing Y u Zikai Song Wei Y ang

Neural Information Processing SystemsNov-19-2025, 15:16:57 GMT

The essence of multi-modal fusion lies in exploiting the complementary information inherent in diverse modalities.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.67)
Information Technology (0.67)
Education (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

6e09c213ac18d6375704a4f3ea75c4f8-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 05:31:37 GMT

experiment, fusion, mamba, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.67)
Information Technology (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Beyond Simple Fusion: Adaptive Gated Fusion for Robust Multimodal Sentiment Analysis

Wu, Han, Sun, Yanming, Yang, Yunhe, Wong, Derek F.

arXiv.org Artificial IntelligenceOct-3-2025

Multimodal sentiment analysis (MSA) leverages information fusion from diverse modalities (e.g., text, audio, visual) to enhance sentiment prediction. However, simple fusion techniques often fail to account for variations in modality quality, such as those that are noisy, missing, or semantically conflicting. This oversight leads to suboptimal performance, especially in discerning subtle emotional nuances. To mitigate this limitation, we introduce a simple yet efficient \textbf{A}daptive \textbf{G}ated \textbf{F}usion \textbf{N}etwork that adaptively adjusts feature weights via a dual gate fusion mechanism based on information entropy and modality importance. This mechanism mitigates the influence of noisy modalities and prioritizes informative cues following unimodal encoding and cross-modal interaction. Experiments on CMU-MOSI and CMU-MOSEI show that AGFN significantly outperforms strong baselines in accuracy, effectively discerning subtle emotions with robust performance. Visualization analysis of feature representations demonstrates that AGFN enhances generalization by learning from a broader feature distribution, achieved by reducing the correlation between feature location and prediction error, thereby decreasing reliance on specific locations and creating more robust multimodal feature representations.

machine learning, natural language, sentiment analysis, (19 more...)

arXiv.org Artificial Intelligence

2510.01677

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Macao (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.87)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.66)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.66)

Add feedback

DyKen-Hyena: Dynamic Kernel Generation via Cross-Modal Attention for Multimodal Intent Recognition

Wang, Yifei, Wang, Wenbin, Luo, Yong

arXiv.org Artificial IntelligenceSep-15-2025

Though Multimodal Intent Recognition (MIR) proves effective by utilizing rich information from multiple sources (e.g., language, video, and audio), the potential for intent-irrelevant and conflicting information across modalities may hinder performance from being further improved. Most current models attempt to fuse modalities by applying mechanisms like multi-head attention to unimodal feature sequences and then adding the result back to the original representation. This process risks corrupting the primary linguistic features with noisy or irrelevant non-verbal signals, as it often fails to capture the fine-grained, token-level influence where non-verbal cues should modulate, not just augment, textual meaning. To address this, we introduce DyKen-Hyena, which reframes the problem from feature fusion to processing modulation. Our model translates audio-visual cues into dynamic, per-token convolutional kernels that directly modulate textual feature extraction. This fine-grained approach achieves state-of-the-art results on the MIntRec and MIntRec2.0 benchmarks. Notably, it yields a +10.46% F1-score improvement in out-of-scope detection, validating that our method creates a fundamentally more robust intent representation.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2509.0994

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.48)

Add feedback

Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate

Zheng, Liangwei Nathan, Zhang, Wei Emma, Guo, Mingyu, Xu, Miao, Maennel, Olaf, Chen, Weitong

arXiv.org Artificial IntelligenceAug-26-2025

Effectively managing missing modalities is a fundamental challenge in real-world multimodal learning scenarios, where data incompleteness often results from systematic collection errors or sensor failures. Sparse Mixture-of-Experts (SMoE) architectures have the potential to naturally handle multimodal data, with individual experts specializing in different modalities. However, existing SMoE approach often lacks proper ability to handle missing modality, leading to performance degradation and poor generalization in real-world applications. We propose ConfSMoE to introduce a two-stage imputation module to handle the missing modality problem for the SMoE architecture by taking the opinion of experts and reveal the insight of expert collapse from theoretical analysis with strong empirical evidence. Inspired by our theoretical analysis, ConfSMoE propose a novel expert gating mechanism by detaching the softmax routing score to task confidence score w.r.t ground truth signal. This naturally relieves expert collapse without introducing additional load balance loss function. We show that the insights of expert collapse aligns with other gating mechanism such as Gaussian and Laplacian gate. The proposed method is evaluated on four different real world dataset with three distinct experiment settings to conduct comprehensive analysis of ConfSMoE on resistance to missing modality and the impacts of proposed gating mechanism.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.19525

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > Queensland (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.38)

Add feedback

Generalizable Multi-Linear Attention Network

Neural Information Processing SystemsAug-22-2025, 00:07:31 GMT

The majority of existing multimodal sequential learning methods focus on how to obtain powerful individual representations and neglect to effectively capture the multimodal joint representation. Bilinear attention network (BAN) is a commonly used integration method, which leverages tensor operations to associate the features of different modalities.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Industry: Leisure & Entertainment (0.46)

Technology: